Weighted distance measures for metabolomic data
Abstract
Motivation: Many analyses of metabolomic data depend on the choice of distance measure, but it is unclear how to make an appropriate choice. The choice is especially unclear when the variance in metabolite abundance is not constant.
Methods: We describe a class of weighted distance measures that account for non-constant variance in metabolite abundance by using a model of the relationship between the variance and the mean. We develop two methods to assess the performance of a distance measure. One method measures repeatability across bootstrap collections of metabolites; the second assesses agreement with patterns expected to be found in the data. These methods were used to compare seven distance measures by evaluating data on 58 metabolites measured in 40 accessions of Echinacea, a genus of plants widely used as botanical supplements.
Results: A new precision-weighted Manhattan distance and the Canberra distance are the most repeatable and the most in agreement with the expected pattern. Distances based on standardized data are intermediate in performance, and unweighted Manhattan or Euclidean distance measures are the least repeatable and the least in
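As a rough illustration of the weighting idea described under Methods above, the following Python sketch computes a Canberra distance and a generic weighted Manhattan distance. The abstract does not give the paper's variance model or its exact weights, so the helper `precision_weights` and its parameters `a` and `b` are hypothetical: it assumes a power-law relationship sd = a * mean^b and weights each metabolite by 1/sd, evaluated at the pairwise mean abundance. This is a sketch under those assumptions, not the authors' definition.

```python
import numpy as np


def canberra(x, y):
    """Canberra distance: sum of |x - y| / (|x| + |y|), skipping 0/0 terms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    mask = den > 0  # terms where both abundances are zero contribute nothing
    return float(np.sum(num[mask] / den[mask]))


def weighted_manhattan(x, y, weights):
    """Weighted Manhattan distance: sum of w_j * |x_j - y_j|."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * np.abs(x - y)))


def precision_weights(x, y, a=1.0, b=1.0):
    """Hypothetical weights 1/sd under an ASSUMED model sd = a * mean**b.

    The mean is taken as the average absolute abundance of the two samples.
    Metabolites with a modelled sd of zero receive weight zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean = (np.abs(x) + np.abs(y)) / 2.0
    sd = a * mean ** b
    w = np.zeros_like(sd)
    positive = sd > 0
    w[positive] = 1.0 / sd[positive]
    return w


# Two made-up abundance profiles over three metabolites (illustrative only).
x = [1.0, 2.0, 0.0]
y = [3.0, 2.0, 0.0]

d_canberra = canberra(x, y)
d_weighted = weighted_manhattan(x, y, precision_weights(x, y))
d_unweighted = weighted_manhattan(x, y, np.ones(3))
```

Under the assumed model with a = 1 and b = 1 (standard deviation proportional to the mean), the weights reduce to 2 / (|x| + |y|), so the weighted Manhattan distance in this sketch equals exactly twice the Canberra distance. Other choices of `a` and `b` give different weightings; setting all weights to 1 recovers the unweighted Manhattan distance.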
Similar resources
Rapid execution of weighted edit distances
The comparison of large numbers of strings plays a central role in ontology matching, record linkage and link discovery. While several standard string distance and similarity measures have been developed with these explicit goals in mind, similarities and distances learned out of the data have been shown to often perform better with respect to the F-measure that they can achieve. Still, the pra...
Decision Making with Distance Measures, Weighted Averages and Induced OWA Operators
We develop a new decision making model by using distance measures, weighted averages and OWA operators. We introduce the induced ordered weighted averaging – weighted averaging distance (IOWAWAD) operator. We study some of its main properties and particular cases such as the weighted Hamming distance, the induced OWA distance (IOWAD), the arithmetic weighted distance and the arithmetic IOWAD op...
Applying weighted network measures to distance matrices
Many approaches to the analysis of weighted networks are not designed for fully connected weighted networks. However, as any distance matrix between objects is a fully connected weighted network, such networks are extremely common. In earlier work we derived an approach for the analysis of weighted networks which also works on fully connec...
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of the MTS analysis process. Much of the research effort in this context has focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...